feat(mcp): tenant_id scoping on search + add_drawer#1269
Open
funguf wants to merge 1 commit intoMemPalace:developfrom
Open
feat(mcp): tenant_id scoping on search + add_drawer#1269funguf wants to merge 1 commit intoMemPalace:developfrom
funguf wants to merge 1 commit intoMemPalace:developfrom
Conversation
mempalace_search and mempalace_add_drawer now accept an optional
tenant_id. When supplied, add_drawer:
- persists tenant_id on the drawer's chromadb metadata
- salts the drawer-id hash with tenant_id so two tenants writing the
same (wing, room, content) no longer collide on the un-tenanted
sha256 — a real cross-tenant data-clobber that a shared mempalace
sidecar would otherwise hit silently
search forwards tenant_id to chromadb's `where` filter so a tenant-
scoped query only sees its own drawers; the sqlite/BM25 fallback path
applies the same filter in Python after FTS5 candidate selection.
Calls without tenant_id keep the legacy hash, write no tenant
metadata, and run unscoped — pre-multitenant deployments are
bit-compatible.
Motivated by the chirpa-pool ↔ mempalace_sidecar contract: chirpa-pool
has been sending tenant_id on every /search and /add since its v1.2
client work, but the dispatcher's whitelist (mcp_server.py:~1854)
silently dropped the field because no schema declared it. The
chirpa-side tenant scoping was effectively a no-op end-to-end. With
this commit the field round-trips into chromadb metadata and the
filter takes effect.
Bumps to 3.3.4 (matches the version pinned in chirpa's
docs/mempalace_sidecar_contract.md as the minimum).
Follow-up not in this commit: a backfill helper that stamps
tenant_id onto pre-existing un-tenanted drawers so they remain
visible to a tenant-scoped query after a single-tenant install
flips MEMPALACE_TENANT_STRICT=true. Without backfill, a tenant-
scoped query on a legacy palace would silently exclude all prior
drawers — operators must either run such a backfill or keep
TENANT_STRICT=false until the upstream migration helper lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds optional
tenant_idtomempalace_searchandmempalace_add_drawerso a shared mempalace instance can serve multiple tenants without cross-tenant data exposure or drawer-id collisions.add_drawer(whentenant_idis supplied):tenant_idon the drawer's chromadb metadatatenant_idso two tenants writing identical(wing, room, content)no longer clobber each other on the un-tenanted SHA-256search(whentenant_idis supplied):wherefilter so the query only sees that tenant's drawers_bm25_only_via_sqlite, used when HNSW is in the HNSW max_elements frozen at 16K while collection grows to 100K+ entries — MCP server segfaults on every tool call #1222 disabled state) applies the same filter in Python after FTS5 candidate selectionCalls without
tenant_idkeep the legacy hash, write no tenant metadata, and run unscoped — pre-multitenant deployments are bit-compatible.Why now
The
mempalace_sidecarHTTP service forwardstenant_idfrom chirpa-pool on every/searchand/add, but the dispatcher's schema-whitelist (mcp_server.py:~1854) silently dropped the field because no schema declared it. The chirpa-side tenant scoping was effectively a no-op end-to-end. With this PR the field round-trips into chromadb metadata and the filter takes effect.Bumps version to
3.3.4(matches the minimum pinned in chirpa-pool'sdocs/mempalace_sidecar_contract.md).Tests
11 new tests, 5 mutating/asserting:
test_add_drawer_with_tenant_id_persists_metadata— round-trip into chromadb metadatatest_add_drawer_tenant_isolation_breaks_cross_tenant_dedup— two tenants with same(wing, room, content)get distinct drawer idstest_add_drawer_no_tenant_keeps_legacy_hash— pre-3.3.4 callers keep bit-compatible drawer idstest_tenant_id_scopes_results— vector-path: tenant-A's search excludes tenant-B's drawerstest_no_tenant_id_returns_unscoped— un-scoped queries unchangedPlus
test_tenant_id_in_filters_fieldand the*_filters_in_resultregression tests for the result envelope.Full suite: 1467 passed, 1 skipped (the same skip exists on develop). Ruff check + format clean.
Follow-up not in this PR
A backfill helper (
python -m mempalace.migrate_tenant) that stampstenant_idonto pre-existing un-tenanted drawers. Without it, an operator who flipsMEMPALACE_TENANT_STRICT=true(sidecar contract) on a legacy palace would see a tenant-scoped query silently exclude every drawer written before this change. Single-tenant deployments today should keepTENANT_STRICT=falseor run a one-off backfill before flipping.Test plan
python -m pytest tests/ --ignore=tests/benchmarks— 1467 passedruff checkcleanruff format --checkcleantool_add_drawerfor the salted-hash behaviour and confirm the legacy-compat branch produces the same drawer id as 3.3.3 for un-tenanted callers.🤖 Generated with Claude Code